INTRODUCTION

A  common  goal  in  multivariate  morphological  studies  is  to  compare  the 
shapes  of  the  organisms  under  study.  On  an  intuitive  level  the  distinction 
between  size  and  shape  is  obvious.  Most  people  could  agree  with  the  dictionary 
definitions  of  shape  as  the  relative  position  of  all  points  composing  the  outline 
or  external  surface  of  an  object  and  size  as  the  space  an  object  occupies.  But 
these two concepts can be difficult to separate in multivariate analyses of
morphological data (Sneath & Sokal, 1973). The data sets from these analyses
normally consist of linear measurements on a series of morphological characters
such  as  skull  length,  tooth  row,  etc.  These  measurements  are  affected  by  both 
the  shape  and  the  size  of  an  organism.  The  goal  of  shape  analysis  is  to  separate 
these  two  parts  of  the  measurements  so  that  shape  comparisons  can  be  made 
between  organisms  of  different  sizes. 

This paper tests the effectiveness of several techniques that have been
developed for shape analysis, investigates how they work, and shows why some
methods  fail,  using  an  experimental  approach  similar  to  those  followed  by  Moss 
(1968,  1971),  Crovello  (1969),  Manischewitz  (1973),  Minkoff  (1965),  and  Rohlf 
(1972).  I  am  defining  effectiveness  as  the  ability  to  classify  objects  of  the  same 
shape  as  similar,  regardless  of  the  size  of  the  objects  (Moss,  1968;  Mosimann, 
1970).  This  is  in  contrast  to  other  kinds  of  shape  analyses  which  have  the  goal  of 
documenting  and  quantifying  consistent  changes  of  shape  with  increasing  size 
(Gould,  1966;  Sweet,  1980;  for  a  discussion  of  this  sort  of  analysis  in  conjunction 
with  allometric  growth,  or  documenting  the  morphological  distinctiveness  of 
predetermined  groups  of  organisms,  see  Albrecht,  1980). 

The  first  problem  in  testing  different  methods  of  shape  analysis  is  that  the 
relationships  of  shape  among  real  objects  are  unknown  and  can  only  be  inferred 
(Corruccini,  1973);  therefore,  no  criteria  exist  to  evaluate  methods  of  analyzing 
shape.  To  produce  objects  of  known  shape  relationships,  I  first  measured  11 
morphological characters on eight species of canids. Then I made artificial
operational taxonomic units (OTUs) with known shape relationships by scalar
multiplication of the data from two canids, kit fox and wolf. For example, all 11
measurements  of  the  wolf  were  multiplied  by  a  constant  of  0.53  to  produce  an 
iso-OTU  about  the  size  of  a  kit  fox.  On  paper  at  least,  I  have  produced  two 
canids  of  different  size  but  of  the  same  shape.  Similar  scalar  multiplications 
were performed to produce a mid-sized wolf, a wolf-sized kit fox, and a
mid-sized kit fox. Of course, isometric enlargement of OTUs is rare or
nonexistent in real biological data, but my procedure has the strength that the
shape relationships within the iso-wolves and the iso-kit foxes are known.
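
The construction of an iso-OTU can be sketched in a few lines of Python. The measurement values below are hypothetical placeholders, not the canid data; only the constant 0.53 comes from the text, and the function name is introduced here for illustration.

```python
import numpy as np

# Hypothetical measurements (mm) for one OTU; NOT the real wolf data.
wolf = np.array([230.0, 105.0, 140.0])

def make_iso_otu(measurements, scale):
    """Return an isometric copy: every character multiplied by one constant."""
    return np.asarray(measurements, dtype=float) * scale

# 0.53 is the constant quoted in the text for a kit-fox-sized iso-wolf.
small_iso_wolf = make_iso_otu(wolf, 0.53)

# Proportions between characters (shape) are unchanged by the scaling.
shape_wolf = wolf / wolf[0]
shape_small = small_iso_wolf / small_iso_wolf[0]
```

Because every character is multiplied by the same constant, any ratio of two characters is identical in the original and in the iso-OTU, which is the sense in which the two share a shape.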

Several methods of shape analysis were applied to the canid data set: sizeout,
correlation, ratio distance, regression analysis, and log-sizeout. My
research indicates that the best results are obtained from ratio distance,
log-sizeout, and correlation of log-transformed data. The following sections
describe each method and analyze its strengths and weaknesses.

METHODS  OF  DATA  TRANSFORMATION 

In  general,  neither  the  means  nor  the  variances  of  characters  are  equal. 
Clearly,  total  head  and  body  length  measurements  will  have  a  larger  mean,  and 
probably  a  higher  standard  deviation,  than  a  measurement  taken  on  the  molar 
tooth row. This produces a problem: characters with large means and/or
variances contribute more to the determination of shape coefficients between OTUs
than do characters with small means and/or variances (Sneath & Sokal, 1973). A
priori  there  is  no  reason  to  weight  these  characters  differently.  A  10%  change  in 
molar  tooth  row  is  not  necessarily  less  important  than  a  10%  change  in  head  and 
body  length.  However,  because  head  and  body  length  has  a  greater  absolute 
measurement,  it  will  tend  to  affect  the  shape  coefficients  more.  Some  methods 
must  be  found  to  equalize  the  effect  of  variables  before  a  shape  analysis  is  done. 
Two methods of data transformation are discussed here. One method is
standardization (Sneath & Sokal, 1973). The standardized score of a character can
be given as:

$$z = \frac{x - \bar{x}}{s_x}$$

where $\bar{x}$ is the mean of that character for the organisms included in the
study and $s_x$ its standard deviation. Thus, the standardized score is
equivalent to the z-value used in statistics. After standardization, the means
of all characters are 0.0 and the standard deviations 1.0. Variables then
contribute equally to the analysis. Because of the subtraction of the mean and
division by the standard deviation, the standardized score of a measurement is,
in part, dependent on the other objects included in the study. The same
organism could receive very different scores for a character in two analyses
which contained different sets of organisms.
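
A small sketch shows how the standardized score of one organism depends on the other organisms in the study. The skull lengths are hypothetical, and the use of the sample standard deviation (ddof=1) is an assumption, since the text does not say which estimator was used.

```python
import numpy as np

def standardize(column):
    """z-score one character across the OTUs of a study (sample SD, ddof=1)."""
    column = np.asarray(column, dtype=float)
    return (column - column.mean()) / column.std(ddof=1)

# Hypothetical skull lengths (mm). The first OTU is the same animal in both studies.
study_a = [120.0, 110.0, 115.0, 125.0]   # analyzed among small animals
study_b = [120.0, 200.0, 230.0, 210.0]   # analyzed among large animals

z_a = standardize(study_a)[0]   # above the mean of study A, so positive
z_b = standardize(study_b)[0]   # below the mean of study B, so negative
```

The same 120 mm skull is "large" in one analysis and "small" in the other, which is the dependence on the included organisms described above.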

Some  of  the  problems  produced  by  standardization  have  already  received 
attention.  Hudson  et  al.  (1966)  and  Sneath  &  Sokal  (1973)  pointed  out  that 
measurements with small variances may be heavily affected by simple
measurement error. The magnification of these errors to equal status with the other
characters  through  standardization  would  be  a  mistake.  Also,  Rohlf  &  Sokal 
(1965),  Rohlf  (1962),  and  Underwood  (1969)  found  that  standardization  normally 
reduces  the  average  correlation  coefficient  between  OTUs.  Finally,  Sneath  & 
Sokal  (1973)  note  that  standardization  reduces  the  atypicality  of  aberrant  OTUs. 
Manischewitz  (1973)  tested  standardization  with  several  other  methods  of 
transformation  and  found  it  to  be  the  most  reliable  method.  However,  as  I  will 
discuss below, other problems also affect the use of standardization in
correlation analysis. These problems are so severe that the standardization of
data in correlation analysis is impossible.

There  are  other  methods  available  for  transforming  data;  one  of  the  most 
promising  is  log  transformation.  Logarithmic  transformation  has  long  been 
associated  with  normalizing  the  curve  shape  of  right  skewed  data;  however,  as 
pointed  out  by  Moriarty  (1977)  and  Lewontin  (1966),  the  effects  of  logarithmic 
transformation go beyond normalization. Perhaps most important is the
property that, in log-transformed data, the standard deviation of a variable is
approximately proportional to the coefficient of variation (CV) of the
untransformed variable (with natural logarithms and small CVs the two are
nearly equal). Thus, after transformation, each variable will tend to
contribute to the analysis in proportion to its CV. Because the mean of a
variable may, in part, determine its relative contribution to the analysis, the
mean of the transformed data can be subtracted from the log-transformed data.
Other properties of log-transformed data are discussed in the section on
log-sizeout shape analysis.
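
The relation between the standard deviation of log-transformed data and the CV of the raw data can be checked numerically. The values below are hypothetical; natural logarithms and the sample standard deviation are assumptions of this sketch.

```python
import numpy as np

# Hypothetical characters with the same relative spread but very different means.
small_char = np.array([9.0, 10.0, 11.0, 10.5, 9.5])
large_char = small_char * 20.0          # same CV, mean 20 times larger

def cv(x):
    """Coefficient of variation of the raw measurements."""
    return x.std(ddof=1) / x.mean()

def log_sd(x):
    """Standard deviation after natural-log transformation."""
    return np.log(x).std(ddof=1)

raw_sd_ratio = large_char.std(ddof=1) / small_char.std(ddof=1)   # 20: raw SD tracks the mean
log_sd_ratio = log_sd(large_char) / log_sd(small_char)           # 1: log SD ignores the mean
```

In raw units the larger character would dominate a distance or correlation; after log transformation both characters carry the same weight, close to their common CV.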

METHODS  OF  SHAPE  ANALYSIS 

Sizeout

In morphometric data there is normally a high intercharacter correlation with
size; large animals tend to have large tails, skulls, feet, etc. Thus, when viewing
a  character  correlation  matrix  from  morphometric  data,  one  usually  notices  a 
great  many  high  positive  values,  especially  between  size-related  characters.  In  a 
principal  components  analysis  (PCA),  the  first  principal  component  is  defined 
as  that  vector  in  hyperspace  which  explains  the  maximum  possible  variation  of 
the  data.  Thus,  if  most  or  all  characters  are  highly  size  dependent,  the  first 
principal  component  will  be  highly  correlated  with  size  (Jolicoeur  &  Mosimann, 
1960;  Jolicoeur,  1963).  This  is  a  general  phenomenon  and  has  been  observed 
repeatedly  with  morphometric  data  (Sneath  &  Sokal,  1973).  Of  course,  it  is  not  a 
rule,  because  size  may  be  less  important  in  some  data  matrices  (i.e.,  the  first 
principal  component  may  not  be  highly  correlated  with  size). 

Once  it  is  determined  that  the  first  principal  component  is  a  size  factor,  the 
effect  of  this  axis  can  be  removed  mathematically  from  a  data  set  (Rohlf  et  al., 
1971). Figure 1 may help the reader to visualize this process. Two characters are
measured on a series of canids, and a principal components analysis is
performed on the data. As can be seen in the figure, the first principal
component is highly size related. If this axis is eliminated, only the second
principal component is left, and distances between OTUs on this second vector
represent "sizeout" distances. Multivariate analyses use more than two
characters, but the method remains the same: the first principal component is
removed and distances are calculated between OTUs based on all the remaining
components. These sizeout distances can be interpreted as indicators of shape
similarity; i.e., the smaller the sizeout distance between OTUs the greater the
shape similarity.
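
The procedure just described can be sketched as follows. This is a minimal illustration, not the program used in the study: it assumes the data matrix has already been standardized or otherwise transformed, obtains the principal axes from a singular value decomposition, and uses a function name introduced here.

```python
import numpy as np

def sizeout_distances(data):
    """Euclidean distances between OTUs after dropping the first principal component.

    data: (n_otus, n_characters) array, assumed already transformed.
    Returns an (n_otus, n_otus) distance matrix.
    """
    X = np.asarray(data, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each character
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt.T                           # OTU coordinates on every component
    shape_scores = scores[:, 1:]                 # discard component 1 (presumed size)
    diff = shape_scores[:, None, :] - shape_scores[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Three hypothetical OTUs lying exactly on one line: all variation is on component 1.
collinear = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
d = sizeout_distances(collinear)                 # every entry is (numerically) zero
```

Only OTUs that differ along the direction of the first component collapse together. OTUs that differ in size along some other direction keep a nonzero sizeout distance.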

Sizeout  analysis  depends  on  the  first  principal  component  being  the  size 
factor,  but  exactly  what  does  this  mean?  The  first  principal  component  is  a 
composite of all the original variables, and those characters which are size
related are generally more important in determining the first principal
component. An example of these loadings is given in Table 1. Notice that, although all
loadings  are  high,  indicating  that  the  first  principal  component  is  highly  size 
related,  all  characters  are  not  equally  high.  The  smallest  loading  is  for  cranial 
width;  this  means  that,  for  this  group  of  animals,  cranial  width  is  least  related  to 
the  linear  "size  factor"  (the  first  principal  component).  An  important  point  to 
remember  is  that  the  loadings  of  these  characters  are  determined  by  the  shapes 
of  the  animals  included  in  the  study.  With  another  group  of  animals  of  different 
shapes,  cranial  width  might  be  more  highly  correlated  with  the  first  principal 
component. 

The loadings for skull length and cranial width are 0.983 and 0.913,
respectively. A graph can be made by plotting these two numbers against one
another to define the sizeout vector for these two character axes (fig. 1). As
already discussed, it is the collapsing of this first size factor that results in
sizeout. Also plotted in Figure 1 are the OTUs on which the PCA was
performed.

[Figure 1; surviving axis label: Skull Length]

Fig. 1. Mechanism of the sizeout analysis for a two-variable case. Notice that the
iso-wolf vector (marked iso-wolf) is parallel to the first principal component (PCA 1), but
the iso-kit fox vector (iso-kit) is not. This leads to the incorrect shape relationships found
among the iso-kit foxes.

Table 1. The loadings of the morphological characters on the first principal component
obtained by a PCA on the canid data set.

    Character                    Loading
    Skull length                 0.983
    Tooth row length             0.929
    Molar tooth row length       0.986
    Width between tooth rows     0.970
    Zygomatic width              0.982
    Nasal length                 0.979
    Cranial width                0.913
    Dentary length               0.987
    Coronoid height              0.987
    Width between incisors       0.953
    Dentary thickness            0.937

In Figure 1 the iso-OTUs are connected to obtain an iso-wolf and an iso-kit
fox line. Because the data have been standardized, all the kit foxes and wolves
no longer lie on a straight line. For the purpose of this example, I have
connected the largest and smallest iso-OTUs. The wolf line is more similar in
slope to the sizeout vector than is the kit fox line; because of the mathematics
of sizeout analysis, the more similar the slope of an iso-OTU vector is to the
sizeout vector, the more completely the analysis will remove the effect of size
from the data. This can be seen graphically if one imagines the collapse of the
sizeout vector to the origin, leaving only the second principal component. All
three wolves, because of the nearly parallel nature of the iso-wolf vector,
collapse to nearly one point on the second vector; however, this is not true
for the iso-kit foxes. This is only a two-dimensional example, but when all 11
characters are considered, the sizeout procedure is still less effective for the
kit foxes than for the wolves. In fact, the distance between the largest and
smallest iso-kit foxes is in the upper 46% of all distances in the sizeout
distance matrix (table 2). This magnitude of error indicates that sizeout can
produce significant changes in results. Iso-enlargements do not exist in real
studies, but the problem remains. Sizeout cannot remove the effect of size from
all groups equally, and the ability of sizeout to function properly will tend
to decrease as animals of more diverse shapes are included in the analysis. In
this example, the shape relationships of the iso-wolves were correctly
calculated, whereas incorrect answers were obtained for the iso-kit foxes. In
another study, which included more canids, the sizeout analysis also failed on
wolves. Evidently, the first principal component was shifted and no longer
parallel to the iso-wolf vector in the second analysis.

Correlation 

Shape analysis using correlation (Michener & Sokal, 1957) involves calculating
the inter-OTU product-moment correlation coefficient for all possible pairs
of OTUs in a study. This coefficient is calculated by the formula:

$$r_{1,2} = \frac{\sum_i (X_{i1} - \bar{X}_1)(X_{i2} - \bar{X}_2)}{\sqrt{\sum_i (X_{i1} - \bar{X}_1)^2 \, \sum_i (X_{i2} - \bar{X}_2)^2}}$$

where $X_{i1}$ is the measurement of the ith character on the first OTU,
$\bar{X}_1$ is the mean of all measurements on OTU 1, and $r_{1,2}$ is the
product-moment correlation coefficient between OTU 1 and OTU 2.

A  scattergram  (fig.  2)  can  give  a  graphical  representation  of  this  correlation  for 
a  pair  of  OTUs  (dog  and  wolf).  A  high  positive  correlation  between  two  OTUs 
may  indicate  great  similarity  in  shape,  whereas  low  values  mean  little  similarity 
(Rohlf  &  Sokal,  1965). 

One  of  the  problems  already  pointed  out  by  Rohlf  &  Sokal  (1965)  with  the  use 
of  correlation  analysis  is  that  OTUs  may  be  highly  correlated  (r  =  1.0)  and  not  be 
of  similar  shape.  Their  example  illustrates  this  point:  the  measurements  1-2-3 
on  object  1  would  correlate  perfectly  with  the  measurements  1-2-3  on  object  2  or 
2-4-6  on  object  3,  and  this  is  consistent  with  our  ideas  on  shape  relationship; 
however,  object  4  with  the  measurements  101-102-103  would  also  be  perfectly 
correlated  with  OTU  1,  even  though  these  two  OTUs  are  not  the  same  shape. 
Although  this  problem  has  been  noted,  its  real  impact  in  an  analysis  is  difficult 
to  determine. 
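
The example of Rohlf & Sokal can be reproduced directly from the product-moment formula. The function name is introduced here for illustration; the measurements are those quoted in the text.

```python
import numpy as np

def otu_correlation(a, b):
    """Product-moment correlation between two OTUs across their characters."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    da, db = a - a.mean(), b - b.mean()          # deviations from each OTU's own mean
    return (da * db).sum() / np.sqrt((da ** 2).sum() * (db ** 2).sum())

obj1 = [1, 2, 3]
obj3 = [2, 4, 6]          # same proportions as object 1, twice the size
obj4 = [101, 102, 103]    # very different proportions

r_13 = otu_correlation(obj1, obj3)   # 1.0, as expected for identical shapes
r_14 = otu_correlation(obj1, obj4)   # also 1.0, although the shapes differ
```

The coefficient is unchanged when a constant is added to every measurement of an OTU, so it cannot distinguish object 4 from a true enlargement of object 1.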
Fig. 2. The morphological measurements of two OTUs, wolf and dog, are plotted
against one another to form this scattergram. The data have not been standardized;
therefore, the correlation coefficient is high (r = 0.993). Standardization reduces the
r-value to 0.375.


The importance of a character in determining r is strongly affected by the
mean value of the characters; for this reason, some data transformation is
needed. The two methods discussed here are standardization and log
transformation.


Standardization 

Tables 2 and 3 give the results of correlation analysis of two separate
standardized data sets, both of which include the iso-wolves and iso-kit foxes.
It is clear from these tables that the results of these correlation analyses,
which should be based on shape alone, are inconsistent with the known
relationships of the iso-OTUs. The differences in results indicate two things:
(1) size is evidently important in determining the product-moment correlation
coefficient because, in every case within an analysis, iso-OTUs which are more
similar in size have larger r-values than do iso-OTUs of very different sizes
(Moss, 1968); and (2) the results of correlation analysis are not consistent
from run to run.

There are a variety of reasons for these failures (also see Minkoff, 1965, for
discussion of correlation analysis and allometric growth). One of the most
important reasons has to do with the standardization process. The iso-kit foxes
illustrate how this problem arises. The small kit fox is the smallest canid in
the study; thus, its skull length and dentary thickness are both relatively
small. When the large kit fox data set was created, its skull length, by
definition, became equal to that of the largest canid, the wolf. However, the
dentary thickness of a jaw of a wolf is proportionally much greater than that of
a kit fox due to allometric growth. Therefore, even though the skull of the
large iso-kit fox is as large as that of the real wolf, its dentary is much
thinner. When the data are
standardized, the small kit fox receives very small scores for both skull length
and dentary thickness (-1.078 and -1.132, respectively). In contrast, the large
kit fox receives a large score for skull length (1.678), but only a slightly
higher than average score (0.588) for its dentary thickness (as compared with
2.269 for the wolf). When looking at only these two traits, the small kit fox
has standardized scores which indicate that its skull length and jaw thickness
are equally small (the ratio of jaw thickness to skull length is 1.05). This is
in sharp contrast to the large kit fox, whose scores indicate a large skull
length but much smaller jaw thickness than would have been predicted from the
scores of the small kit fox (the ratio for the large kit fox is 0.348). This
difference and others reduce the correlation between the small and large kit
fox from the expected value of 1.0 to 0.553 (table 1).

The result just cited is from a data set where most of the animals were
fox-like. In a second data set where more wolf-like OTUs were added, the normal
mid- to large-sized animals had a much thicker dentary than before; the mean
dentary thickness shifted from 19.3 mm in the first run to 22.2 mm in the second
analysis. The result is that the large kit fox looks all the more peculiar after
standardization in the second run. This time the scores of the small kit fox were
skull length, -1.315; and dentary thickness, -1.371 (the ratio is 1.04). The
scores of the large kit fox were skull length, 1.460; and dentary thickness, 0.203
(a ratio of 0.139). Clearly, according to the standardized scores, these two
iso-OTUs, at least for this trait, appear to have different shapes.
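
The effect can be reproduced on a toy data set. The matrix below is hypothetical (four OTUs, three characters) and is not the canid data; rows 0 and 1 are iso-OTUs by construction.

```python
import numpy as np

# Hypothetical OTU x character matrix; row 1 is exactly 2 * row 0 (same shape).
data = np.array([
    [10.0,  5.0, 2.0],
    [20.0, 10.0, 4.0],
    [12.0,  9.0, 2.5],
    [18.0,  8.0, 5.0],
])

def standardize_characters(X):
    """z-score every character (column) across the OTUs in the study."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def row_correlation(X, i, j):
    """Product-moment correlation between OTUs i and j across characters."""
    return np.corrcoef(X[i], X[j])[0, 1]

r_raw = row_correlation(data, 0, 1)                           # 1.0 before standardization
r_std = row_correlation(standardize_characters(data), 0, 1)   # far below 1.0 afterwards
```

Standardization rescales each character by a different amount that depends on the other OTUs, so two rows that were exact multiples of one another no longer are.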

Log  Transformation 

The  results  of  the  correlation  analysis  on  log-transformed  data  are  shown  in 
Table  3.  These  results  are  consistent  with  the  known  shape  relationships  of  the 
iso-OTUs. 
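
Why log transformation preserves the iso-OTU relationships can be seen in a short sketch: multiplying every character by a constant adds the same constant to every log measurement, and the correlation coefficient ignores such a shift. The data are hypothetical, and subtracting the character means after transformation follows the option mentioned in the section on data transformation.

```python
import numpy as np

# Hypothetical OTU x character matrix; row 1 is exactly 2 * row 0 (same shape).
data = np.array([
    [10.0,  5.0, 2.0],
    [20.0, 10.0, 4.0],
    [12.0,  9.0, 2.5],
    [18.0,  8.0, 5.0],
])

def log_center(X):
    """Natural-log transform, then subtract each character's mean log value."""
    logged = np.log(X)
    return logged - logged.mean(axis=0)

Z = log_center(data)
shift = Z[1] - Z[0]                       # log(2) in every character
r_log = np.corrcoef(Z[0], Z[1])[0, 1]     # correlation between the two iso-OTUs
```

Because no character is divided by a study-dependent standard deviation, the constant offset between the iso-OTUs survives intact and their correlation remains 1.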

Ratios 

Distance  in  standardized  ratio  space  has  a  unique  property  as  compared  with 
that  in  the  previous  methods  in  that  ratios  are  shape  characters  (Simpson  et  al., 
1960;  Corruccini,  1975).  The  axes  in  hyperspace  are  now  shape  axes  instead  of 
simple  raw  data  axes,  and  any  change  along  an  axis  now  represents  an  actual 
change in shape. Because ratios formed with size-related characters as
denominators are shape measures, there is no need for a method to remove the
effect of size further. It is true that a ratio may be highly correlated with linear
size, and this is often related to allometric growth patterns. For instance, large
canids  generally  have  relatively  narrow  crania  as  compared  with  small  canids. 
Therefore,  a  negative  correlation  exists  between  the  ratio  of  cranial  width/skull 
length  and  some  measure  of  size  such  as  skull  length  or  head  and  body  length. 
This  relationship  should  not  be  confused  with  the  size  correlation  discussed 
under  sizeout  analysis.  The  correlation  between  the  cranial  width/skull  length 
ratio  and  size  represents  the  allometric  change  of  shape  with  size,  whereas  the 
positive  linear  correlation  between  cranial  width  and  skull  length  represents  the 
degree  of  maintenance  of  shape  with  increasing  size. 

As  with  sizeout  analysis,  distances  computed  in  ratio  space  measure  the 
similarity  in  shape  of  OTUs;  small  distances  indicate  high  shape  similarities. 
Standardization,  which  had  such  a  detrimental  effect  on  correlation,  does  not 
adversely  affect  distance  in  ratio  space.  The  results  of  ratio  analysis  on  the 
iso-OTUs  are  entirely  consistent  with  our  understanding  of  their  true  shape 
relationships (table 4). The only effect standardization has is to assess
differences in shape relative to the total difference present in the data matrix.
Thus, if two OTUs vary greatly on the ratio axis on which other OTUs do not,
these two OTUs would be more distant than if the other OTUs that were
included in the analysis varied greatly on this axis as well. This may not be a
problem as long as one remembers that distance in ratio space after
standardization is relative. In many cases relative shape distances are sufficient;
however,  if  some  absolute  measure  is  needed,  then  standardization  cannot  be 
used. 
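
A minimal sketch of distance in standardized ratio space follows. The text does not specify which characters served as denominators, so the use of a single size-related character as the denominator for all ratios is an assumption; the data and function name are hypothetical.

```python
import numpy as np

def ratio_distances(data, size_col=0, standardize=True):
    """Euclidean distances between OTUs in (optionally standardized) ratio space.

    Each character is divided by one size-related character (column size_col),
    and the denominator column itself is dropped.
    """
    X = np.asarray(data, dtype=float)
    ratios = np.delete(X / X[:, [size_col]], size_col, axis=1)
    if standardize:
        ratios = (ratios - ratios.mean(axis=0)) / ratios.std(axis=0, ddof=1)
    diff = ratios[:, None, :] - ratios[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical matrix; row 1 is exactly 2 * row 0 (same shape, different size).
data = np.array([
    [10.0,  5.0, 2.0],
    [20.0, 10.0, 4.0],
    [12.0,  9.0, 2.5],
    [18.0,  8.0, 5.0],
])
D = ratio_distances(data)   # D[0, 1] is zero: identical ratios stay identical
```

Standardizing the ratios shifts and rescales each ratio axis by the same amount for every OTU, so identical ratios remain identical and only the scale of the nonzero distances becomes relative to the study.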

It should be noted here that Atchley et al. (1976) have recently strongly
questioned the use of ratios. Their article was equally vigorously rebutted by
Hills (1978), Dodson (1978), and Albrecht (1978); but see Atchley (1978) and
Atchley & Anderson (1978). At this point it is certainly safe to say that ratios
are controversial. My own view from reading all the papers concerned is that
reasonable, thoughtful use of ratios may still be a powerful tool.

Regression-Residual  Analysis 

Several  methods  are  combined  under  this  topic.  All  involve  using  the  data  to 
generate  a  vector  that  is  size  related  and  then  removing  the  effect  of  that  vector 
on  the  data  set.  Methods  that  can  be  used  here  are  regression,  reduced  major 
axis,  and  partial  correlation.  This  technique  is  similar  to  sizeout,  except  that 
now  one  variable  is  specified  as  the  size  factor.  The  general  idea  is  to  regress 
this  variable  against  all  other  variables.  The  residuals  from  this  analysis  indicate 
the  relative  size  of  each  variable.  This  method  suffers  from  the  same  problem  as 
sizeout.  The  slope  and  intercept  of  the  regression  lines  are  affected  by  the  OTUs 
included  in  the  study.  As  in  the  sizeout  example,  the  iso-wolves  and  iso-foxes 
are  not  affected  equally  by  this  analysis  (table  4).  Once  again  the  iso-wolves  are 
more  effectively  characterized  than  the  iso-kit  foxes. 
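
The simplest variant, ordinary least-squares regression on one designated size character, can be sketched as follows. This is an illustration only: the study also mentions reduced major axis and partial correlation, which are not shown, and the data and function name are hypothetical.

```python
import numpy as np

def regression_residuals(data, size_col=0):
    """Residuals of every other character after regression on one size character."""
    X = np.asarray(data, dtype=float)
    size = X[:, size_col]
    others = np.delete(X, size_col, axis=1)
    A = np.column_stack([np.ones_like(size), size])      # intercept and slope terms
    coef, *_ = np.linalg.lstsq(A, others, rcond=None)    # fits all characters at once
    return others - A @ coef

# Hypothetical matrix; row 1 is exactly 2 * row 0 (same shape, different size).
data = np.array([
    [10.0,  5.0, 2.0],
    [20.0, 10.0, 4.0],
    [12.0,  9.0, 2.5],
    [18.0,  8.0, 5.0],
])
res = regression_residuals(data)   # rows 0 and 1 receive different residuals
```

The fitted line is determined by all OTUs in the study, so the two iso-OTUs end up with different residuals unless their iso-line happens to coincide with the regression line.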

This  kind  of  analysis,  often  in  conjunction  with  log  transformation,  can  be 
useful  in  removing  the  effects  of  allometric  growth  from  data.  In  such  cases,  the 
effect  of  consistent  changes  in  shape  can  be  removed.  However,  care  should  be 
used  to  know  exactly  what  is  being  removed  from  an  analysis.  At  the  extreme,  a 
researcher  has  used  this  approach  (partial  correlation)  to  remove  the  effect  of 
size  from  chromosome  number  and  color  pattern. 
Log-Sizeout

The log-sizeout method is similar to the sizeout and regression-residual
methods in that the effect of a size vector is removed from the data matrix. The
difference in the log-sizeout method is that the size vector is not determined by
the data matrix. This is possible because, after a data matrix is log transformed,
all the iso-OTU lines have equal slopes in hyperspace. A simple bivariate plot
can illustrate this. In Figure 1, skull length and cranial width are plotted. Notice
once again the two iso-OTU lines are of different slopes. Because of the
difference in slope, no single vector can remove the effect of size from both of
these groups. However, when the data are log transformed, both iso-OTU lines
now have a slope of 1, with different intercepts (fig. 3). Log-sizeout removes the
effect of size by removing the effect of a vector with slope 1 in all dimensions
related to size and zero slope for variables that are not size related. The effect of
this vector can be removed from the distance matrix by the equation:

$$D_{\text{log-sizeout}\;1,2} = \sqrt{D_{\log\,1,2}^{2} - \left(N\bar{X}_{\log\,1} - N\bar{X}_{\log\,2}\right)^{2}}$$

where $N$ is equal to the number of size-related variables, $\bar{X}_{\log\,1}$ is
the mean of the log-transformed size variables of object 1, and $D_{\log\,1,2}$ is
the euclidean distance between two OTUs in log-transformed space. The quantity
$N\bar{X}_{\log\,1}$ is equal to the location of an object on the size vector. This
location is one definition for size. The equation can be read as the distance
from object 1 to object 2 in log-transformed hyperspace minus their difference
in size, which leaves only the shape differences to affect distance.

Fig. 3. Once the data set has been log transformed, all iso-OTU vectors are parallel and
have slope 1.00. Compare this situation with standardized data plotted in Figure 1.

Analysis  of  the  canid  data  indicates  that  all  iso-OTUs  are  correctly  classified 
(table  5). 
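
The idea of removing a fixed, equal-slope size vector can be sketched as below. The sketch assumes that all characters are size related and measures size along a unit-length vector, so that an object's size coordinate is sqrt(N) times its mean log measurement. That scaling is an assumption of the sketch, chosen so that the subtraction is an exact orthogonal projection, and it differs from the N-times-mean quantity given in the text. The data and function name are hypothetical.

```python
import numpy as np

def log_sizeout_distance(otu1, otu2):
    """Distance in log space with the component along the size vector removed.

    Assumes every character is size related and uses the unit vector
    (1, ..., 1) / sqrt(N) as the size vector (an assumption of this sketch).
    """
    x1 = np.log(np.asarray(otu1, dtype=float))
    x2 = np.log(np.asarray(otu2, dtype=float))
    n = x1.size
    d_log_sq = ((x1 - x2) ** 2).sum()                   # squared distance in log space
    size_diff = np.sqrt(n) * (x1.mean() - x2.mean())    # difference along the size vector
    # Guard against tiny negative values caused by floating-point rounding.
    return np.sqrt(max(d_log_sq - size_diff ** 2, 0.0))

a = np.array([10.0, 5.0, 2.0])     # hypothetical OTU
c = np.array([12.0, 9.0, 2.5])     # hypothetical OTU of a different shape

d_iso = log_sizeout_distance(a, 2.0 * a)     # essentially zero: same shape
d_shape = log_sizeout_distance(a, c)         # positive: shapes differ
d_shape_big = log_sizeout_distance(2.0 * a, c)   # same as d_shape: size is ignored
```

Because the size vector is fixed in advance rather than estimated from the data, the result for any pair of OTUs does not depend on which other OTUs are included in the study.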

CONGRUENCE  OF  TECHNIQUES 

Another way of comparing methods of shape analysis is to see how similarly
the methods assess shape relations among a group of OTUs. One way to do this
is to correlate the similarity (or dissimilarity) matrices of all the techniques.
Several of these scattergrams are shown in Figure 4a-h, and Table 6 contains all
the "cophenetic" correlation coefficients between methods. Not surprisingly, in
view of the evidence already presented, the correlation between methods is not
always particularly high. As an example, two methods mentioned as
possibilities by Sneath & Sokal (1973), sizeout and correlation, are correlated
with an r-value of only -0.623 (fig. 4h). To get an overview of the relationships
of the methods, I constructed a phenogram based on the correlations (fig. 5). The
most striking point of this phenogram is that techniques using similar methods
give similar results.
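
Correlating two methods amounts to correlating the corresponding off-diagonal entries of their matrices. A minimal sketch, with a hypothetical function name and toy matrices:

```python
import numpy as np

def matrix_correlation(m1, m2):
    """Correlation between the upper-triangle (off-diagonal) entries of two
    square OTU x OTU matrices of the same size."""
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    iu = np.triu_indices_from(m1, k=1)     # each OTU pair counted once
    return np.corrcoef(m1[iu], m2[iu])[0, 1]

# Toy distance matrix for three OTUs and a similarity matrix derived from it.
dist = np.array([[0.0, 1.0, 2.0],
                 [1.0, 0.0, 3.0],
                 [2.0, 3.0, 0.0]])
sim = 1.0 - dist / 3.0

r_same = matrix_correlation(dist, 2.0 * dist)   # 1.0: identical ordering of pairs
r_mixed = matrix_correlation(dist, sim)         # -1.0: distance versus similarity
```

When a distance measure is compared with a similarity measure, agreement between the methods appears as a negative coefficient, so the magnitude rather than the sign indicates congruence in such comparisons.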

The sizeout and regression methods are similar mathematically in their way
of removing a size vector. Consistent with that fact, the shape relationships
generated by these two methods are most similar to one another. Likewise, the
correlation of standardized data and the correlation of log-transformed data
are mathematically most similar, and their results are more similar to one
another than to those of the other methods. The log-sizeout method is harder to
classify mathematically. On one hand it can be visualized as the removal of a
size vector of slope 1 in all size-related dimensions, much the same as sizeout.
On the other hand it resembles the ratio technique, as it entails the subtraction
of logarithms (a process closely related to division of non-log numbers). The
phenogram indicates that the results of log-sizeout agree most closely with
those of the ratio technique and are only distantly related to results of sizeout
or regression.

Table 6. The congruence of the methods of shape analysis discussed in this report can
be seen, using these inter-method correlation coefficients.

Fig. 4. The shape coefficients of the different methods are plotted against one another in
scattergrams to obtain a visual representation of the congruence of these techniques. a,
Comparison of distance of OTUs in ratio space and log-sizeout space; this yielded the
highest correlation, r = 0.959. b, Comparison of log-sizeout with the correlation
coefficients, using log-transformed data. c, Comparison of the log-sizeout method with
regular sizeout; notice the wide scatter (r = 0.667) between these two methods. Along the
vertical axis are several circled points; these are the iso-OTU comparisons where shapes of
the OTUs are known to be identical. Although log-sizeout correctly classifies these
relationships, sizeout does a poor job. d, Comparison of distances in ratio space with
distances in sizeout space. e, Comparison of distances in ratio space and correlation
coefficients on standardized data. As mentioned in the text, the correlation method failed
here, but comparison with graph c shows that correlation was more effective than sizeout
in classifying iso-OTUs. f, Comparison of the correlation coefficients on log-transformed
data with distances in ratio space. g, Comparison of the correlation coefficients on
log-transformed data with the correlation coefficients on standardized data; notice the tight
relationship (r = 0.924), which indicates the similarity of the two methods. h,
Comparison of the correlation coefficients on standardized data with distance in sizeout
space; notice the surprisingly low correlation of these two techniques (r = -0.623).

Fig. 5. A phenogram of the methods of shape analysis studied in this report was made
based on the inter-method correlations of Table 6. The phenogram shows the methods fall
into three groups, with the group of correlation of log-transformed data (LOG CORR)
and correlation of standardized data (STAN CORR) and the group of ratio distances
(RATIO) and distances in log-sizeout space being most similar. Sizeout distances
(SIZEOUT) and the regression method (REGRESSION) are more distantly related.

SUMMARY 

Six methods of shape analysis were tested, and half of them (sizeout,
regression analysis, and correlation of standardized data) failed to classify
iso-OTUs correctly. The other three (correlation of log-transformed data,
log-sizeout, and ratio distance) all passed my test equally well. Even among
the successful methods, however, there are considerable differences in the
estimated shape relationships. Based on the data presented here, it is not
possible to distinguish whether one of the three methods is more successful
than the others, or whether differences in results represent different views of
shape, internally consistent and equally valid.

Having said this, I will state that my own slight preference is for the
log-sizeout method. Correlation, even with log transformation, suffers from some
conceptual  problems  mentioned  above.  Ratios,  which  on  the  surface  appear  the 
most  satisfactory  of  shape  indices,  have  been  attacked  as  producing  spurious 
correlations  (Atchley  et  al.,  1976). 

I  believe  that  studies  such  as  this  one  are  necessary  for  critical  judgment  of 
methods  of  shape  analysis.  Using  iso-OTUs  has  proved  partially  successful  in 
giving  insights  into  how  these  methods  work,  and  sometimes  fail  to  work. 
More studies, perhaps with more sophisticated artificial OTU shape
relationships, are needed to continue this work.